A Seriation Approach for Visualization-Driven Discovery of Co-Expression Patterns in Serial Analysis of Gene Expression (SAGE) Data

Authors

  • Olena Morozova
  • Vyacheslav Morozov
  • Brad G. Hoffman
  • Cheryl D. Helgason
  • Marco A. Marra
Abstract

BACKGROUND: Serial Analysis of Gene Expression (SAGE) is a DNA sequencing-based method for large-scale gene expression profiling that provides an alternative to microarray analysis. Most analyses of SAGE data aimed at identifying co-expressed genes have been accomplished using various versions of clustering approaches that often result in a number of false positives.

PRINCIPAL FINDINGS: Here we explore the use of seriation, a statistical approach for ordering sets of objects based on their similarity, for large-scale expression pattern discovery in SAGE data. For this specific task we implement a seriation heuristic we term 'progressive construction of contigs' that constructs local chains of related elements by sequentially rearranging margins of the correlation matrix. We apply the heuristic to the analysis of simulated and experimental SAGE data and compare our results to those obtained with a clustering algorithm developed specifically for SAGE data. We show using simulations that the performance of seriation compares favorably to that of the clustering algorithm on noisy SAGE data.

CONCLUSIONS: We explore the use of a seriation approach for visualization-based pattern discovery in SAGE data. Using both simulations and experimental data, we demonstrate that seriation is able to identify groups of co-expressed genes more accurately than a clustering algorithm developed specifically for SAGE data. Our results suggest that seriation is a useful method for the analysis of gene expression data whose applicability should be further pursued.
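
To make the idea of seriation on a correlation matrix concrete, the following is a minimal sketch of a greedy chain-building ordering. It is not the authors' 'progressive construction of contigs' heuristic, whose details are not given in the abstract; the function name `greedy_seriation` and the end-extension rule are assumptions chosen only for illustration. The sketch seeds a chain with the most correlated pair of genes and then repeatedly attaches the remaining gene that correlates best with either end of the chain.

```python
import numpy as np

def greedy_seriation(corr):
    """Return an ordering of items that places correlated items side by side.

    Illustrative greedy heuristic (an assumption for this sketch), not the
    'progressive construction of contigs' method described in the paper.
    Expects a square correlation matrix with at least two rows.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]

    # Mask the diagonal so an item is never paired with itself.
    masked = corr.copy()
    np.fill_diagonal(masked, -np.inf)

    # Seed the chain with the most strongly correlated pair.
    i, j = np.unravel_index(np.argmax(masked), masked.shape)
    chain = [int(i), int(j)]
    remaining = set(range(n)) - set(chain)

    while remaining:
        rest = sorted(remaining)
        left_scores = corr[chain[0], rest]    # similarity to the left end
        right_scores = corr[chain[-1], rest]  # similarity to the right end
        if left_scores.max() > right_scores.max():
            chain.insert(0, rest[int(np.argmax(left_scores))])
            remaining.discard(chain[0])
        else:
            chain.append(rest[int(np.argmax(right_scores))])
            remaining.discard(chain[-1])
    return chain
```

Given a genes-by-libraries count matrix `X` (a hypothetical name), one might compute `corr = np.corrcoef(X)`, call `order = greedy_seriation(corr)`, and then view `corr[np.ix_(order, order)]` as a heatmap, where groups of co-expressed genes would be expected to appear as blocks along the diagonal.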

Similar resources

Identification of Prognostic Genes in Her2-enriched Breast Cancer by Gene Co-Expression Network Analysis

Introduction: HER2-enriched subtype of breast cancer has a worse prognosis than luminal subtypes. Recently, the discovery of targeted therapies in other groups of breast cancer has increased patient survival. The aim of this study was to identify genes that affect the overall survival of this group of patients based on a systems biology approach. Methods: Gene expression data and clinical infor...

Study of Gene Expression Signatures for the Diagnosis of Pediatric Acute Lymphoblastic Leukemia (ALL) Through Gene Expression Array Analyses

Background: Acute lymphoblastic leukemia (ALL), the most common malignancy in children, is associated with high mortality and significant relapse. Currently, the non-invasive diagnosis of pediatric ALL is a major challenge in the early detection of patients. In the present study, a systems biology approach was used through network-based analysis to identify the key candidate genes related to AL...

Enhanced Expression of Genes Involved in the Biosynthesis Pathway of Tanshinones in Tetraploid Plants of Salvia Officinalis L.

Extended Abstract. Introduction and Objective: Polyploidy is one of the main factors in plant adaptation and can increase the production of secondary metabolites in plants. Salvia officinalis L. is a perennial plant from the Lamiaceae family with a long history of use in the medicinal industry. Tanshinones are crucial active compounds biosynthesized in Salvia. This study aimed to analyze the expr...

Gene Class expression: analysis tool of Gene Ontology terms with gene expression data.

Serial analysis of gene expression (SAGE) technology produces large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in these gene sets. We present an interactive web-based tool, called Gene Class, which allows functional annotation of SAGE data using the Gene Ontology (GO) database. This tool performs sear...

Long serial analysis of gene expression for gene discovery and transcriptome profiling in the widespread marine coccolithophore Emiliania huxleyi.

The abundant and widespread coccolithophore Emiliania huxleyi plays an important role in mediating CO2 exchange between the ocean and the atmosphere through its impact on marine photosynthesis and calcification. Here, we use long serial analysis of gene expression (SAGE) to identify E. huxleyi genes responsive to nitrogen (N) or phosphorus (P) starvation. Long SAGE is an elegant approach for ex...

Journal:
  • PLoS ONE

Volume 3  Issue 

Pages  -

Publication date 2008